1. Introduction

This paper deals with belief models for both finite and countable sequences of exchange- 
able random variables taking a finite number of values. When such sequences of random 
variables are assumed to be exchangeable, this more-or-less means that the specific order 
in which they are observed is deemed irrelevant. 

The first detailed study of exchangeability was made by de Finetti [5] (with the termi- 
nology of 'equivalent' events). He proved the now famous representation theorem, which 
is often interpreted as stating that a sequence of random variables is exchangeable if it is 
conditionally independent and identically distributed (i.i.d.). Other important work on 
exchangeability was done by, amongst many others, Hewitt and Savage [12], Heath and 
Sudderth [10], Diaconis and Freedman [8] and, in the context of the behavioural theory 
of imprecise probabilities that we are going to consider here, by Walley [19]. We refer to 
Kallenberg [14, 15] for modern, measure-theoretic discussions of exchangeability. 

One of the reasons why exchangeability is deemed important, especially by Bayesians, 
is that, by virtue of de Finetti's representation theorem, an exchangeable model can be 
seen as a convex mixture of multinomial models. This has lent some support [2, 5, 7] to 
the claim that aleatory probabilities and i.i.d. processes can be eliminated from statistics 
and that we can restrict ourselves to exchangeable sequences instead; see Walley [19],
Section 9.5.6 for a critical discussion of this claim. 

De Finetti presented his study of exchangeability in terms of the behavioural notion
of previsions, or fair prices. The central assumption underlying his approach is that a
subject should be able to specify a fair price $P(f)$ for any risky transaction (which we
will call a gamble) $f$ ([7], Chapter 3). This may not always be realistic, so it has been
suggested that we should explicitly allow for a subject's indecision, by distinguishing
between his lower prevision $\underline{P}(f)$, which is the supremum price for which he is willing to
buy the gamble $f$, and his upper prevision $\overline{P}(f)$, which is the infimum price for which he
is willing to sell $f$. For any real number $r$ strictly between $\underline{P}(f)$ and $\overline{P}(f)$, the subject is
then not specifying a choice between selling or buying the gamble $f$ for $r$. Such lower and
upper previsions are also subject to certain rationality or coherence criteria, in very much 
the same way that (precise) previsions are, in de Finetti's account. The resulting theory 
of coherent lower previsions, brilliantly defended by Walley [19], generalises de Finetti's 
behavioural treatment of subjective, epistemic probability and is briefly overviewed in 
Section 2. 

Also, in this theory, it is interesting to consider the consequences of a subject's ex- 
changeability assessment, that is, that the order in which we consider a number of ran- 
dom variables has no impact. This is our motivation for studying exchangeable lower 
previsions in this paper. An assessment of exchangeability will have a clear impact on 
the structure of so-called exchangeable coherent lower previsions. We will show that such 
a prevision can be written as a combination of (i) a coherent (linear) prevision expressing 
that permutations of realisations of such sequences are considered equally likely, and (ii) 
a coherent lower prevision for the 'frequency' of occurrence of the different values the 
random variables can take. Of course, this is the essence of representation in de Finetti's 
sense - we generalise his results to coherent lower previsions. 

Before we go on, we want to draw attention to a number of distinctive features of our 
approach. First, the usual proofs of the representation theorem, such as the ones given 
by de Finetti [5], Heath and Sudderth [10] and Kallenberg [15], do not lend themselves 
very easily to generalisation in terms of coherent lower previsions. In principle, it would 
be possible, at least in some cases, to start with the versions already known for (precise) 
previsions and to derive their counterparts for lower previsions using so-called lower 
envelope theorems; see Section 2 for more details. This is the method that Walley [19], 
Sections 9.5.3 and 9.5.4, suggests. However, we have decided to follow a different route: 
we derive our results directly for lower previsions, using an approach based on Bernstein 
polynomials, and we obtain the ones for previsions as special cases. We believe this 
method to be more elegant and self-contained, and it certainly has the additional benefit 
of drawing attention to what we feel is the essence of de Finetti's representation theorem: 
specifying a coherent belief model for a countable exchangeable sequence is tantamount 
to specifying a coherent (lower) prevision on the linear space of polynomials on some 
simplex, and nothing more. 

Second, we will focus on - and use the language of - (lower and upper) previsions 
for gambles, rather than (lower and upper) probabilities for events: in the behavioural 
theory of imprecise probabilities, the language of gambles is much more expressive than 
that of events and we need its full expressive power to derive our results. 



Exchangeable lower previsions 






The paper is organised as follows. In Section 2, we introduce a number of results from 
the theory of coherent lower previsions necessary to understand the rest of the paper. In 
Section 3, we define exchangeability for finite sequences of random variables and establish 
a representation of coherent exchangeable lower previsions in terms of sampling without 
replacement. In Section 4, we extend the notion of exchangeability to countable sequences 
of random variables and in Section 5 we generalise de Finetti's representation theorem
(in terms of multinomial sampling) to exchangeable coherent lower previsions. In the 
Appendix, we have gathered a few useful results about Bernstein polynomials. 

2. Lower previsions, random variables and their 
distributions 

In this section, we provide a brief summary of ideas and results from the theory of 
coherent lower previsions [19]. 

2.1. Epistemic uncertainty models 

Consider a random variable $X$ that may assume values $x$ in some non-empty set $\mathcal{X}$. By
'random', we mean that a subject is uncertain about the actual value of the variable $X$,
that is, does not know what this actual value is.

Our subject may entertain certain beliefs about the value of $X$. We try to model his
beliefs mathematically using the concept of a gamble on $\mathcal{X}$, which is a bounded map $f$
from $\mathcal{X}$ to the set $\mathbb{R}$ of real numbers. We denote by $\mathcal{L}(\mathcal{X})$ the set of all gambles on $\mathcal{X}$.

De Finetti [7] proposed the modelling of a subject's beliefs by eliciting his fair price,
or prevision, $P(f)$ for certain gambles $f$. This $P(f)$ can be defined as the unique real
number $p$ such that the subject is willing to buy the gamble $f$ for all prices $s$ (that is,
accept the gamble $f - s$) and sell $f$ for all prices $t$ (that is, accept the gamble $t - f$) for
all $s < p < t$. The problem with this approach is that it presupposes that there is such a
real number, or, in other words, that the subject, whatever his beliefs about $X$ are, is
willing, for (almost) every real $r$, to make a choice between buying $f$ for the price $r$ or
selling it for that price.

2.2. Coherent lower previsions and natural extension 

A way to address this problem is to consider a model that allows our subject to be
undecided for some prices $r$. This is done in Walley's [19] theory of lower and upper previsions.
The lower prevision of the gamble $f$, $\underline{P}(f)$, is our subject's supremum acceptable buying
price for $f$; similarly, our subject's upper prevision, $\overline{P}(f)$, is his infimum acceptable
selling price for $f$. Hence, he is willing to buy the gamble $f$ for all prices $s < \underline{P}(f)$ and
sell $f$ for all prices $t > \overline{P}(f)$, but he may be undecided for prices $\underline{P}(f) \le p \le \overline{P}(f)$.

Since buying the gamble $f$ for a price $s$ is the same as selling the gamble $-f$ for the
price $-s$, the lower and upper previsions are conjugate functions: $\underline{P}(f) = -\overline{P}(-f)$ for
any gamble $f$. This allows us to concentrate on one of them since we can immediately
derive results for the other. In this paper, we focus mainly on lower previsions.

G. de Cooman, E. Quaeghebeur and E. Miranda

The lower probability $\underline{P}(A)$ of an event $A \subseteq \mathcal{X}$ is defined as the lower prevision of its
indicator $I_A$: $\underline{P}(A) = \underline{P}(I_A)$; $I_A$ is the gamble that assumes the value one on $A$ and zero
elsewhere. For the upper probability $\overline{P}(A)$ of $A$, we similarly have that $\overline{P}(A) = \overline{P}(I_A)$.

For lower previsions, the most important rationality criterion is that of coherence. If
a lower prevision $\underline{P}$ is defined on a linear space of gambles $\mathcal{K}$, then it turns out to be
coherent if and only if it satisfies the following conditions. For any gambles $f$ and $g$ in
$\mathcal{K}$ and any non-negative real number $\lambda$, it should hold that:

(P1) $\underline{P}(f) \ge \inf f$ [accepting sure gains];

(P2) $\underline{P}(\lambda f) = \lambda \underline{P}(f)$ [non-negative homogeneity];

(P3) $\underline{P}(f + g) \ge \underline{P}(f) + \underline{P}(g)$ [superadditivity].

The following special properties hold for a coherent lower prevision whenever the gambles
involved belong to its domain:

(i) $\underline{P}$ is monotone, that is, if $f \le g$, then $\underline{P}(f) \le \underline{P}(g)$;

(ii) $\inf f \le \underline{P}(f) \le \overline{P}(f) \le \sup f$.

Moreover, coherent lower and upper previsions are continuous with respect to uniform 
convergence of gambles. 

2.3. Linear previsions 

If the lower prevision $\underline{P}(f)$ and the upper prevision $\overline{P}(f)$ for a gamble $f$ happen to
coincide, then the value $P(f) = \underline{P}(f) = \overline{P}(f)$ is called the subject's (precise) prevision
for $f$. Previsions are fair prices in de Finetti's [7] sense. We shall call them precise
probability models and lower previsions will be called imprecise.

A prevision on the set $\mathcal{L}(\mathcal{X})$ of all gambles is linear if and only if it is a positive
($f \ge 0 \Rightarrow P(f) \ge 0$) and normed ($P(1) = 1$) real linear functional. A prevision on a
general domain is linear if and only if it can be extended to a linear prevision on all
gambles. We shall denote by $\mathbb{P}(\mathcal{X})$ the set of all linear previsions on $\mathcal{L}(\mathcal{X})$.

There is an interesting link between precise and imprecise probability models, expressed
via the so-called lower envelope theorem as follows. A lower prevision $\underline{P}$ on some domain
$\mathcal{K}$ is coherent if and only if it is the lower envelope of some set of linear previsions and, in
particular, of the convex set $\mathcal{M}(\underline{P})$ of all linear previsions that dominate it: for all $f$ in
$\mathcal{K}$, $\underline{P}(f) = \inf\{P(f) : P \in \mathcal{M}(\underline{P})\}$, where $\mathcal{M}(\underline{P}) := \{P \in \mathbb{P}(\mathcal{X}) : (\forall f \in \mathcal{K})(P(f) \ge \underline{P}(f))\}$.
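The lower envelope construction can be tried out numerically on a finite space. The following is a minimal sketch, assuming a credal set made of just two hypothetical mass functions (the numbers and the helper names `expectation`, `lower_prevision` and `upper_prevision` are illustrative and do not come from the paper); it computes a lower prevision as a minimum of expectations and the upper prevision by conjugacy.

```python
def expectation(pmf, gamble):
    """Linear prevision of a gamble under one probability mass function."""
    return sum(pmf[x] * gamble[x] for x in pmf)


def lower_prevision(credal_set, gamble):
    """Lower envelope: smallest expectation over a finite set of mass functions."""
    return min(expectation(pmf, gamble) for pmf in credal_set)


def upper_prevision(credal_set, gamble):
    """Conjugate upper prevision, computed as -lower(-f)."""
    return -lower_prevision(credal_set, {x: -v for x, v in gamble.items()})


# Hypothetical credal set on a three-element space (illustrative numbers only).
credal_set = [
    {"a": 0.5, "b": 0.3, "c": 0.2},
    {"a": 0.2, "b": 0.5, "c": 0.3},
]
f = {"a": 1.0, "b": -1.0, "c": 2.0}
g = {"a": 0.0, "b": 3.0, "c": -1.0}
f_plus_g = {x: f[x] + g[x] for x in f}

low_f = lower_prevision(credal_set, f)   # about 0.3
up_f = upper_prevision(credal_set, f)    # about 0.6
```

For these particular gambles, the envelope satisfies (P1) and (P3): `low_f` is not below `min(f.values())`, and the lower prevision of `f_plus_g` is not below the sum of the lower previsions of `f` and `g`.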

2.4. The distribution of a random variable 

We call a subject's coherent lower prevision $\underline{P}$ on $\mathcal{L}(\mathcal{X})$, modelling his beliefs about the
value that a random variable $X$ assumes in the set $\mathcal{X}$, his distribution for that random
variable.

If we now consider another set $\mathcal{Y}$ and a map $\varphi$ from $\mathcal{X}$ to $\mathcal{Y}$, then we can consider
$Y := \varphi(X)$ as a random variable assuming values in $\mathcal{Y}$. With a gamble $h$ on $\mathcal{Y}$, there
corresponds a gamble $h \circ \varphi$ on $\mathcal{X}$ whose lower prevision is $\underline{P}(h \circ \varphi)$. This leads us to
define the distribution of $Y = \varphi(X)$ as the induced coherent lower prevision $\underline{Q}$ on $\mathcal{L}(\mathcal{Y})$,
defined by

$$\underline{Q}(h) := \underline{P}(h \circ \varphi), \quad h \in \mathcal{L}(\mathcal{Y}).$$

This notion generalises that of an induced probability measure. 

Finally, consider a sequence of random variables $X_n$, all taking values in some metric
space $S$. Denote by $\mathcal{C}(S)$ the set of all continuous gambles on $S$. For each random
variable $X_n$, we have a distribution in the form of a coherent lower prevision $\underline{P}_{X_n}$ on $\mathcal{C}(S)$.
We then say that the random variables converge in distribution if for all $h \in \mathcal{C}(S)$, the
sequence of real numbers $\underline{P}_{X_n}(h)$ converges to some real number, which we denote by
$\underline{P}(h)$. The limit lower prevision $\underline{P}$ on $\mathcal{C}(S)$ that we can define in this way is coherent,
because a pointwise limit of coherent lower previsions always is.

3. Exchangeable random variables 

We are now ready to recall Walley's [19], Section 9.5, notion of exchangeability in the 
context of the theory of coherent lower previsions. We shall see that it generalises de
Finetti's definition for linear previsions [5, 7].

3.1. Definition and basic properties 

Consider $N \ge 1$ random variables $X_1, \ldots, X_N$ taking values in a non-empty and finite set
$\mathcal{X}$. A subject's beliefs about the values that these random variables $X = (X_1, \ldots, X_N)$
assume jointly in $\mathcal{X}^N$ are given by their (joint) distribution, which is a coherent lower
prevision $\underline{P}^N$ defined on the set $\mathcal{L}(\mathcal{X}^N)$.

Let us denote by $\mathcal{P}_N$ the set of all permutations of $\{1, \ldots, N\}$. With any such
permutation $\pi$, we can associate, by the procedure of lifting, a permutation of $\mathcal{X}^N$, also denoted
by $\pi$, that maps any $x = (x_1, \ldots, x_N)$ in $\mathcal{X}^N$ to $\pi x := (x_{\pi(1)}, \ldots, x_{\pi(N)})$. Similarly, with
any gamble $f$ on $\mathcal{X}^N$, we can consider the permuted gamble $\pi f := f \circ \pi$.

A subject judges the random variables $X_1, \ldots, X_N$ to be exchangeable when he is
disposed to exchange any gamble $f$ for the permuted gamble $\pi f$, meaning that
$\underline{P}^N(\pi f - f) \ge 0$, for any permutation $\pi$. Taking into account the properties of coherence, this
means that

$$\underline{P}^N(f - \pi f) = \overline{P}^N(f - \pi f) = \underline{P}^N(\pi f - f) = \overline{P}^N(\pi f - f) = 0$$

for all gambles $f$ on $\mathcal{X}^N$ and all permutations $\pi$ in $\mathcal{P}_N$. In this case, we also call the
joint coherent lower prevision $\underline{P}^N$ exchangeable. A subject will make an assumption of
exchangeability when there is evidence that the processes generating the values of the
random variables are (physically) similar ([19], Section 9.5.2), and consequently the order
in which the variables are observed is not important.

When $\underline{P}^N$ is, in particular, a linear prevision $P^N$, exchangeability is equivalent to
having $P^N(\pi f) = P^N(f)$ for all gambles $f$ and all permutations $\pi$. The following
proposition, mentioned by Walley [19], Section 9.5, and whose proof is immediate and therefore
omitted, establishes an even stronger link between Walley's and de Finetti's notions of
exchangeability.

Proposition 1. A coherent lower prevision $\underline{P}^N$ is exchangeable if and only if all the
linear previsions $P^N$ in $\mathcal{M}(\underline{P}^N)$ are exchangeable.
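One direction of Proposition 1 can be illustrated on a tiny example. The sketch below assumes two variables with values in a hypothetical set {"a", "b"} and a lower envelope of two hand-picked mass functions that each give equal mass to ("a", "b") and ("b", "a"); all names and numbers are illustrative assumptions. It checks that the lower prevision of $\pi f - f$ vanishes for every permutation.

```python
from itertools import permutations, product

X = ("a", "b")
N = 2
tuples = list(product(X, repeat=N))


def permute(pi, z):
    """Lifted permutation: (pi z)_k = z_{pi(k)}."""
    return tuple(z[pi[k]] for k in range(len(z)))


def permuted_gamble(pi, f):
    """The permuted gamble (pi f)(z) = f(pi z)."""
    return {z: f[permute(pi, z)] for z in f}


def lower_prevision(credal_set, f):
    """Lower envelope of a finite set of mass functions on X^N."""
    return min(sum(p[z] * f[z] for z in f) for p in credal_set)


# Hypothetical exchangeable mass functions: mass depends only on the counts.
credal_set = [
    {("a", "a"): 0.4, ("a", "b"): 0.2, ("b", "a"): 0.2, ("b", "b"): 0.2},
    {("a", "a"): 0.1, ("a", "b"): 0.3, ("b", "a"): 0.3, ("b", "b"): 0.3},
]
f = {("a", "a"): 1.0, ("a", "b"): 5.0, ("b", "a"): -2.0, ("b", "b"): 0.0}

differences = []
for pi in permutations(range(N)):
    pf = permuted_gamble(pi, f)
    diff = {z: pf[z] - f[z] for z in tuples}
    differences.append(lower_prevision(credal_set, diff))
```

Each entry of `differences` should be zero up to rounding, which is the exchangeability condition for this particular gamble.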

Clearly, if $X_1, \ldots, X_N$ are exchangeable, then any permutation $X_{\pi(1)}, \ldots, X_{\pi(N)}$ is
also exchangeable and has the same distribution $\underline{P}^N$. Moreover, any selection of $1 \le n \le N$
random variables from amongst the $X_1, \ldots, X_N$ is exchangeable too and their
distribution is given by $\underline{P}^n$, which is the $\mathcal{X}^n$-marginal of $\underline{P}^N$, defined by $\underline{P}^n(f) := \underline{P}^N(\tilde f)$
for all gambles $f$ on $\mathcal{X}^n$, where the gamble $\tilde f$ on $\mathcal{X}^N$ is the cylindrical extension of $f$
to $\mathcal{X}^N$, given by $\tilde f(z_1, \ldots, z_N) := f(z_1, \ldots, z_n)$ for all $(z_1, \ldots, z_N)$ in $\mathcal{X}^N$.

3.2. Count vectors 

Interestingly, exchangeable coherent lower previsions have a very simple representation,
in terms of sampling without replacement. To see how this comes about, consider any
$x \in \mathcal{X}^N$. The so-called (permutation) invariant atom

$$[x] := \{\pi x : \pi \in \mathcal{P}_N\}$$

is then the smallest non-empty subset of $\mathcal{X}^N$ that contains $x$ and is invariant under all
permutations $\pi$ in $\mathcal{P}_N$. The set of permutation invariant atoms of $\mathcal{X}^N$
constitutes a partition of the set $\mathcal{X}^N$. We can characterise these invariant
atoms using the counting maps $T^N_x : \mathcal{X}^N \to \mathbb{N}_0$ defined for all $x$ in $\mathcal{X}$ in such a way that

$$T^N_x(z) = T^N_x(z_1, \ldots, z_N) := |\{k \in \{1, \ldots, N\} : z_k = x\}|$$

is the number of components of the $N$-tuple $z$ that assume the value $x$. Here, $|A|$ denotes
the number of elements in a finite set $A$ and $\mathbb{N}_0$ is the set of all non-negative integers
(including zero). We shall denote by $T^N$ the vector-valued map from $\mathcal{X}^N$ to $\mathbb{N}_0^{\mathcal{X}}$ whose
component maps are the $T^N_x$, $x \in \mathcal{X}$. $T^N$ actually assumes values in the set of count
vectors

$$\mathcal{N}^N := \Bigl\{\mathbf{m} \in \mathbb{N}_0^{\mathcal{X}} : \sum_{x \in \mathcal{X}} m_x = N\Bigr\}.$$
The counting map $T^N$ can be interpreted as a bijection (one-to-one and onto) between
the set of invariant atoms and the set of count vectors $\mathcal{N}^N$, and we can identify
any invariant atom $[z]$ by the count vector $\mathbf{m} = T^N(z)$ of any (and therefore all) of
its elements. We therefore also denote this atom by $[\mathbf{m}]$. Clearly $y \in [\mathbf{m}]$ if and only if
$T^N(y) = \mathbf{m}$. The number of elements $\nu(\mathbf{m})$ in any invariant atom $[\mathbf{m}]$ is given by
the multinomial coefficient

$$\nu(\mathbf{m}) := \binom{N}{\mathbf{m}} = \frac{N!}{\prod_{x \in \mathcal{X}} m_x!}.$$
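The correspondence between invariant atoms and count vectors is easy to check by brute force on a small example. The following sketch assumes a hypothetical three-element set and represents a count vector as a tuple ordered like that set; the helper names `count_vector`, `atom_size` and `atom` are illustrative.

```python
from collections import Counter
from itertools import product
from math import factorial, prod


def count_vector(z, X):
    """T^N(z): how often each value x in X occurs in the tuple z."""
    counts = Counter(z)
    return tuple(counts[x] for x in X)


def atom_size(m):
    """Multinomial coefficient N! / prod_x m_x!, the size of the atom [m]."""
    return factorial(sum(m)) // prod(factorial(k) for k in m)


def atom(m, X):
    """Enumerate the invariant atom [m] by filtering all tuples of length N."""
    return [z for z in product(X, repeat=sum(m)) if count_vector(z, X) == m]


X = ("a", "b", "c")                          # hypothetical value set
m = count_vector(("a", "c", "a", "b"), X)    # two a's, one b, one c
```

Here `len(atom(m, X))` and `atom_size(m)` should agree, since both count the rearrangements of the tuple.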



If the joint random variable $X = (X_1, \ldots, X_N)$ assumes the value $z$ in $\mathcal{X}^N$, then the
corresponding count vector assumes the value $T^N(z)$ in $\mathcal{N}^N$. This means that we can
see $T^N(X) = T^N(X_1, \ldots, X_N)$ as a random variable in $\mathcal{N}^N$. If the available information
about the values that $X$ assumes in $\mathcal{X}^N$ is given by the coherent exchangeable lower
prevision $\underline{P}^N$ (the distribution of $X$), then the corresponding uncertainty model for the
values that $T^N(X)$ assumes in $\mathcal{N}^N$ is given by the coherent induced lower prevision $\underline{Q}^N$
on $\mathcal{L}(\mathcal{N}^N)$ (the distribution of $T^N(X)$), given by

$$\underline{Q}^N(h) := \underline{P}^N(h \circ T^N) = \underline{P}^N\Bigl(\sum_{\mathbf{m} \in \mathcal{N}^N} h(\mathbf{m}) I_{[\mathbf{m}]}\Bigr) \quad \text{for all gambles } h \text{ on } \mathcal{N}^N. \qquad (1)$$

We now come to a theorem showing that, conversely, any exchangeable coherent lower
prevision $\underline{P}^N$ is in fact completely determined by the corresponding distribution $\underline{Q}^N$ of
the count vectors, also called its count distribution.

Consider an urn with $N$ balls of different types, where the different types are
characterised by the elements $x$ of the set $\mathcal{X}$. Suppose the composition of the urn is given by
the count vector $\mathbf{m} \in \mathcal{N}^N$, meaning that $m_x$ balls are of type $x$ for $x \in \mathcal{X}$. We are now
going to subsequently select (in a random way) $N$ balls from the urn, without replacing
them. It follows that for any gamble $f$ on $\mathcal{X}^N$, its (precise) prevision (or expectation) is
given by

$$\mathrm{MuHy}^N(f \mid \mathbf{m}) := \frac{1}{\nu(\mathbf{m})} \sum_{z \in [\mathbf{m}]} f(z).$$

The linear prevision $\mathrm{MuHy}^N(\cdot \mid \mathbf{m})$ is the one associated with a multiple hypergeometric
distribution ([13], Chapter 39), whence the notation. For any permutation $\pi$ of $\{1, \ldots, N\}$,

$$\mathrm{MuHy}^N(\pi f \mid \mathbf{m}) = \frac{1}{\nu(\mathbf{m})} \sum_{z \in [\mathbf{m}]} f(\pi z) = \frac{1}{\nu(\mathbf{m})} \sum_{\pi^{-1} z \in [\mathbf{m}]} f(z) = \mathrm{MuHy}^N(f \mid \mathbf{m})$$

since $\pi^{-1} z \in [\mathbf{m}]$ if and only if $z \in [\mathbf{m}]$. This means that the linear prevision $\mathrm{MuHy}^N(\cdot \mid \mathbf{m})$
is exchangeable. The following theorem establishes an even stronger result. It is an
immediate consequence of a much more general representation result by de Cooman and
Miranda [4], Theorem 30.
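The multiple hypergeometric prevision is simply an average over an invariant atom, so it can be computed by enumeration when $N$ is small. The sketch below assumes a hypothetical urn with two "r" balls and one "g" ball and a gamble on the first draw; the names are illustrative, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import permutations, product


def count_vector(z, X):
    """Counts of each value of X in the tuple z."""
    return tuple(z.count(x) for x in X)


def muhy(f, m, X):
    """Average of the gamble f over all tuples whose count vector is m."""
    N = sum(m)
    atom = [z for z in product(X, repeat=N) if count_vector(z, X) == m]
    return sum(Fraction(f(z)) for z in atom) / len(atom)


X = ("r", "g")   # hypothetical ball types
m = (2, 1)       # urn composition: two "r" balls, one "g" ball


def f(z):
    """Gamble paying 1 if the first ball drawn is "r", and 0 otherwise."""
    return 1 if z[0] == "r" else 0


value = muhy(f, m, X)

# The permuted gambles (pi f)(z) = f(pi z) all receive the same prevision.
permuted_values = [
    muhy(lambda z, pi=pi: f(tuple(z[pi[k]] for k in range(3))), m, X)
    for pi in permutations(range(3))
]
```

With this urn, `value` is 2/3, and every entry of `permuted_values` equals it, in line with the exchangeability of the multiple hypergeometric prevision.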

Theorem 2 (Representation theorem for finite sequences of exchangeable
variables). Let $N \ge 1$. A coherent lower prevision $\underline{P}^N$ on $\mathcal{L}(\mathcal{X}^N)$ is exchangeable if
and only if there is some coherent lower prevision $\underline{Q}$ on $\mathcal{L}(\mathcal{N}^N)$ such that
$\underline{P}^N(f) = \underline{Q}(\mathrm{MuHy}^N(f \mid \cdot))$ for all gambles $f$ on $\mathcal{X}^N$. If a coherent lower prevision $\underline{P}^N$ on $\mathcal{L}(\mathcal{X}^N)$
is exchangeable, then the corresponding $\underline{Q}$ is given by equation (1).

This theorem implies that any collection of $N$ exchangeable random variables in $\mathcal{X}$
can be seen as the result of $N$ random draws without replacement from an urn with $N$
balls whose types are characterised by the elements $x$ of $\mathcal{X}$ and whose composition $\mathbf{m}$ is
unknown, but for which the available information about the composition is modelled by
a coherent lower prevision on $\mathcal{L}(\mathcal{N}^N)$.^1
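The urn interpretation suggests a direct way of building exchangeable lower previsions: pick any coherent model for the composition and push gambles through the hypergeometric average. The following sketch assumes binary variables, $N = 3$, and a lower envelope of two hypothetical mass functions on the four possible compositions (all numbers are illustrative). It then checks the exchangeability condition for one gamble and one permutation, and that the count distribution of equation (1) is recovered.

```python
from fractions import Fraction
from itertools import product

X = (0, 1)
N = 3
tuples = list(product(X, repeat=N))
count_vectors = [(N - k, k) for k in range(N + 1)]   # (number of 0s, number of 1s)


def count_vector(z):
    return (z.count(0), z.count(1))


def muhy(f, m):
    """Hypergeometric average of the gamble f (a dict on tuples) over the atom [m]."""
    atom = [z for z in tuples if count_vector(z) == m]
    return sum(Fraction(f[z]) for z in atom) / len(atom)


# Hypothetical credal set describing beliefs about the urn composition.
quarter, half = Fraction(1, 4), Fraction(1, 2)
count_credal_set = [
    {(3, 0): quarter, (2, 1): quarter, (1, 2): quarter, (0, 3): quarter},
    {(3, 0): half, (2, 1): 0, (1, 2): 0, (0, 3): half},
]


def lower_Q(h):
    """Lower envelope on gambles h (dicts) defined on count vectors."""
    return min(sum(q[m] * h[m] for m in count_vectors) for q in count_credal_set)


def lower_P(f):
    """Lower prevision on tuples defined by composing lower_Q with muhy."""
    return lower_Q({m: muhy(f, m) for m in count_vectors})


f = {z: 2 * z[0] + z[1] * z[2] for z in tuples}
swapped = {z: f[(z[2], z[1], z[0])] for z in tuples}      # pi swaps first and last
diff = {z: swapped[z] - f[z] for z in tuples}

h = {m: m[1] ** 2 for m in count_vectors}
h_on_tuples = {z: h[count_vector(z)] for z in tuples}
```

In this example `lower_P(diff)` is zero and `lower_P(h_on_tuples)` coincides with `lower_Q(h)`. The choice of a finite credal set is only a convenience for the sketch; the theorem itself allows any coherent lower prevision on the count vectors.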

That exchangeable linear previsions can be interpreted in terms of sampling without re- 
placement from an urn with unknown composition is of course well known and essentially 
goes back to de Finetti's work on exchangeability [1, 5]. Heath and Sudderth [10] give a 
simple proof for variables that may assume two values. However, we believe our proof of 
the much more general representation result ([4], Theorem 30), to be conceptually even 
simpler than Heath and Sudderth's proof. 

4. Exchangeable sequences 
4.1. Definitions 

Consider a countable sequence $X_1, \ldots, X_n, \ldots$ of random variables taking values in the
same non-empty set $\mathcal{X}$. This sequence is called exchangeable if any finite collection of
random variables taken from this sequence is exchangeable.

We can also consider the exchangeable sequence as a single random variable $X$
assuming values in the set $\mathcal{X}^{\mathbb{N}}$, where $\mathbb{N}$ is the set of natural numbers (positive integers,
without zero). Its possible values $x$ are sequences $x_1, \ldots, x_n, \ldots$ of elements of $\mathcal{X}$ or, in
other words, maps from $\mathbb{N}$ to $\mathcal{X}$. We can model the available information about the value
that $X$ assumes in $\mathcal{X}^{\mathbb{N}}$ by a coherent lower prevision $\underline{P}^{\mathbb{N}}$ on $\mathcal{L}(\mathcal{X}^{\mathbb{N}})$, called the distribution
of the exchangeable random sequence $X$.

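The random sequence $X$, or its distribution $\underline{P}^{\mathbb{N}}$, is clearly exchangeable if and only if
all of its $\mathcal{X}^n$-marginals $\underline{P}^n$ are exchangeable for $n \ge 1$. These marginals $\underline{P}^n$ on $\mathcal{L}(\mathcal{X}^n)$ are
defined as follows: for any gamble $f$ on $\mathcal{X}^n$, $\underline{P}^n(f) := \underline{P}^{\mathbb{N}}(\tilde f)$, where $\tilde f$ is the cylindrical
extension of $f$ to $\mathcal{X}^{\mathbb{N}}$ defined by $\tilde f(x) := f(x_1, \ldots, x_n)$ for all $x = (x_1, \ldots, x_n, x_{n+1}, \ldots)$
in $\mathcal{X}^{\mathbb{N}}$. In addition, the family of exchangeable coherent lower previsions $\underline{P}^n$, $n \ge 1$,
satisfies the 'time consistency' requirement

$$\underline{P}^n(f) = \underline{P}^{n+k}(\tilde f) \qquad (2)$$

for all $n \ge 1$, $k \ge 0$ and all gambles $f$ on $\mathcal{X}^n$, where $\tilde f$ now denotes the cylindrical
extension of $f$ to $\mathcal{X}^{n+k}$: $\underline{P}^n$ should be the $\mathcal{X}^n$-marginal of any $\underline{P}^{n+k}$.

It follows at once that any finite collection of $n \ge 1$ random variables taken from such
an exchangeable sequence has the same distribution as the first $n$ variables $X_1, \ldots, X_n$,
which is the exchangeable coherent lower prevision $\underline{P}^n$ on $\mathcal{L}(\mathcal{X}^n)$.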



^1 Walley [19], Chapter 9, also mentions this result for exchangeable coherent lower previsions.
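Conversely, suppose we have a collection of exchangeable coherent lower previsions $\underline{P}^n$
on $\mathcal{L}(\mathcal{X}^n)$, $n \ge 1$, that satisfy the time consistency requirement (2). Then any coherent
lower prevision on $\mathcal{L}(\mathcal{X}^{\mathbb{N}})$ that has $\mathcal{X}^n$-marginals $\underline{P}^n$ is exchangeable. The smallest,
or most conservative, such (exchangeable) coherent lower prevision is given by

$$\underline{P}^{\mathbb{N}}(f) := \sup_{n \in \mathbb{N}} \underline{P}^n(\underline{\mathrm{proj}}_n(f)) = \lim_{n \to \infty} \underline{P}^n(\underline{\mathrm{proj}}_n(f)),$$

where $f$ is any gamble on $\mathcal{X}^{\mathbb{N}}$ and its lower projection $\underline{\mathrm{proj}}_n(f)$ on $\mathcal{X}^n$ is the gamble
on $\mathcal{X}^n$ that is defined by $\underline{\mathrm{proj}}_n(f)(x) := \inf\{f(z) : z \in \mathcal{X}^{\mathbb{N}},\ z_k = x_k,\ k = 1, \ldots, n\}$ for all $x \in \mathcal{X}^n$; see de
Cooman and Miranda [3], Section 5, for more details.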

4.2. Time consistency of the count distributions 

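It is of crucial interest for what follows to determine the consequences of the time
consistency requirement (2) on the marginals $\underline{P}^n$ for the corresponding family $\underline{Q}^n$, $n \ge 1$, of
distributions of the count vectors $T^n(X_1, \ldots, X_n)$. Consider, therefore, $n \ge 1$, $k \ge 0$ and
any gamble $h$ on $\mathcal{N}^n$. If we let $f := h \circ T^n$ and denote by $\tilde f$ the cylindrical extension of $f$
to $\mathcal{X}^{n+k}$, then

$$\underline{Q}^n(h) = \underline{P}^n(f) = \underline{P}^{n+k}(\tilde f) = \underline{Q}^{n+k}\bigl(\mathrm{MuHy}^{n+k}(\tilde f \mid \cdot)\bigr),$$

where the first equality follows from equation (1), the second from equation (2) and the
last from Theorem 2. Now, for any $\mathbf{m}'$ in $\mathcal{N}^{n+k}$ and any $z' = (z, y)$ in $\mathcal{X}^{n+k} = \mathcal{X}^n \times \mathcal{X}^k$,
we have that $T^{n+k}(z') = T^n(z) + T^k(y)$ and therefore

$$\mathrm{MuHy}^{n+k}(\tilde f \mid \mathbf{m}') = \frac{1}{\nu(\mathbf{m}')} \sum_{\mathbf{m} \in \mathcal{N}^n} \nu(\mathbf{m}' - \mathbf{m}) \sum_{z \in [\mathbf{m}]} f(z) = \sum_{\mathbf{m} \in \mathcal{N}^n} \frac{\nu(\mathbf{m})\,\nu(\mathbf{m}' - \mathbf{m})}{\nu(\mathbf{m}')}\, h(\mathbf{m}),$$

taking into account that $\mathrm{MuHy}^n(f \mid \mathbf{m}) = h(\mathbf{m})$ and that $\nu(\mathbf{m}' - \mathbf{m})$ is zero unless $\mathbf{m} \le \mathbf{m}'$.
So we see that time consistency is equivalent to

$$\underline{Q}^n(h) = \underline{Q}^{n+k}\Bigl(\sum_{\mathbf{m} \in \mathcal{N}^n} \frac{\nu(\mathbf{m})\,\nu(\cdot - \mathbf{m})}{\nu(\cdot)}\, h(\mathbf{m})\Bigr) \qquad (3)$$

for all $n \ge 1$, $k \ge 0$ and $h \in \mathcal{L}(\mathcal{N}^n)$.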
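The gamble that appears on the right-hand side of equation (3) is a hypergeometric average of $h$, and its weights can be computed explicitly. The sketch below assumes a three-element value set encoded as positions 0, 1, 2 of a count tuple, with $n = 2$ and $k = 3$; the function names `nu`, `weight` and `lift_gamble` and the test gamble are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product
from math import factorial, prod


def nu(m):
    """Multinomial coefficient of a count vector; zero if an entry is negative."""
    if any(k < 0 for k in m):
        return 0
    return factorial(sum(m)) // prod(factorial(k) for k in m)


def count_vectors(n, size):
    """All vectors of `size` non-negative integers that sum to n."""
    return [m for m in product(range(n + 1), repeat=size) if sum(m) == n]


def weight(m, m_big):
    """The coefficient nu(m) * nu(m' - m) / nu(m') from equation (3)."""
    rest = tuple(b - a for a, b in zip(m, m_big))
    return Fraction(nu(m) * nu(rest), nu(m_big))


def lift_gamble(h, n, k, size):
    """Turn a gamble h on count vectors of total n into one on total n + k."""
    return {
        mb: sum(weight(m, mb) * h[m] for m in count_vectors(n, size))
        for mb in count_vectors(n + k, size)
    }


n, k, size = 2, 3, 3
h = {m: m[0] ** 2 - m[2] for m in count_vectors(n, size)}   # arbitrary test gamble
lifted = lift_gamble(h, n, k, size)
weight_sums = [
    sum(weight(m, mb) for m in count_vectors(n, size))
    for mb in count_vectors(n + k, size)
]
```

For each count vector of total $n + k$ the weights add up to one, so `lifted` is the expected value of `h` evaluated on the counts of the first $n$ draws from an urn of $n + k$ balls.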

5. A representation theorem for exchangeable 
sequences 

De Finetti [5, 7] has proven a representation result for exchangeable sequences with
linear previsions that generalises Theorem 2 and where multinomial distributions take over
the role that the multiple hypergeometric ones play for finite collections of exchangeable
variables. One simple and intuitive way (see also [7], p. 218) to understand why the
representation result can be thus extended from finite collections to countable sequences is
based on the fact that the multinomial distribution can be seen as a limit of multiple
hypergeometric ones ([13], Chapter 39). This is also the central idea behind Heath and
Sudderth's [10] simple proof of this representation result in the case of variables that may
only assume two possible values.

However, there is another, arguably even simpler, approach to proving the same results, 
which we present here. It also works for exchangeability in the context of coherent lower 
previsions. And, as we shall have occasion to explain further on, it has the additional 
advantage of clearly indicating what the 'representation' is and where it is uniquely 
defined. 

5.1. Multinomial processes are exchangeable 

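Consider a sequence of i.i.d. random variables $Y_1, \ldots, Y_n, \ldots$ with common probability
mass function $\theta$: the probability that $Y_n = x$ is $\theta_x$ for $x \in \mathcal{X}$. Observe that $\theta$ is an
element of the $\mathcal{X}$-simplex

$$\Sigma := \Bigl\{\theta \in \mathbb{R}^{\mathcal{X}} : (\forall x \in \mathcal{X})(\theta_x \ge 0) \text{ and } \sum_{x \in \mathcal{X}} \theta_x = 1\Bigr\}.$$

Then, for any $n \ge 1$ and any $z$ in $\mathcal{X}^n$, the probability that $(Y_1, \ldots, Y_n)$ is equal to $z$ is
given by $\prod_{x \in \mathcal{X}} \theta_x^{T^n_x(z)}$, which yields the multinomial mass function ([13], Chapter 35). As
a result, we have for any gamble $f$ on $\mathcal{X}^n$ that its corresponding (multinomial) prevision
(expectation) is given by

$$\mathrm{Mn}^n(f \mid \theta) = \mathrm{CoMn}^n(\mathrm{MuHy}^n(f \mid \cdot) \mid \theta), \qquad (4)$$

where we defined the (count multinomial) linear prevision $\mathrm{CoMn}^n(\cdot \mid \theta)$ on $\mathcal{L}(\mathcal{N}^n)$ by

$$\mathrm{CoMn}^n(g \mid \theta) := \sum_{\mathbf{m} \in \mathcal{N}^n} g(\mathbf{m})\, \nu(\mathbf{m}) \prod_{x \in \mathcal{X}} \theta_x^{m_x}, \qquad (5)$$

where $g$ is any gamble on $\mathcal{N}^n$. The corresponding probability mass for any count
vector $\mathbf{m}$,

$$\mathrm{CoMn}^n(\{\mathbf{m}\} \mid \theta) = \nu(\mathbf{m}) \prod_{x \in \mathcal{X}} \theta_x^{m_x} =: B_{\mathbf{m}}(\theta), \qquad (6)$$

is the probability of observing some value $z$ for $(Y_1, \ldots, Y_n)$ whose count vector is
$\mathbf{m}$. The polynomial function $B_{\mathbf{m}}$ on the $\mathcal{X}$-simplex is called a (multivariate) Bernstein
(basis) polynomial. The set $\{B_{\mathbf{m}} : \mathbf{m} \in \mathcal{N}^n\}$ of all Bernstein (basis) polynomials
of fixed degree $n$ forms a basis for the linear space of all (multivariate) polynomials
on $\Sigma$ whose degree is at most $n$, hence their name. If we have a polynomial $p$
of degree $m$, this means that for any $n \ge m$, $p$ has a unique (Bernstein) decomposition
$b_p^n \in \mathcal{L}(\mathcal{N}^n)$ such that $p = \sum_{\mathbf{m} \in \mathcal{N}^n} b_p^n(\mathbf{m}) B_{\mathbf{m}}$. If we combine this with
equations (5) and (6), we find that $b_p^n$ is the unique gamble on $\mathcal{N}^n$ such that
$\mathrm{CoMn}^n(b_p^n \mid \cdot) = p$.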

We deduce from equation (4) and Theorem 2 that the linear prevision $\mathrm{Mn}^n(\cdot \mid \theta)$ on
$\mathcal{L}(\mathcal{X}^n)$ is exchangeable and that $\mathrm{CoMn}^n(\cdot \mid \theta)$ is the corresponding distribution for the
corresponding count vectors $T^n(Y_1, \ldots, Y_n)$. Therefore, the sequence of i.i.d. random
variables $Y_1, \ldots, Y_n, \ldots$ is exchangeable.
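Equation (4) says that a multinomial expectation can be computed in two stages: average within each invariant atom, then weight the atoms by Bernstein basis polynomials. The following sketch checks this numerically for $n = 3$, assuming a hypothetical mass function `theta` on a three-element set and an arbitrary gamble; the helper names are illustrative.

```python
from itertools import product
from math import factorial, isclose, prod

X = ("a", "b", "c")
theta = {"a": 0.5, "b": 0.3, "c": 0.2}   # hypothetical mass function
n = 3
tuples = list(product(X, repeat=n))


def count_vector(z):
    return tuple(z.count(x) for x in X)


def nu(m):
    """Multinomial coefficient of the count vector m."""
    return factorial(sum(m)) // prod(factorial(k) for k in m)


def bernstein(m, theta):
    """Bernstein basis polynomial: nu(m) times the product of theta_x ** m_x."""
    return nu(m) * prod(theta[x] ** k for x, k in zip(X, m))


count_vectors = sorted({count_vector(z) for z in tuples})


def mn(f, theta):
    """Multinomial prevision by direct summation over all tuples."""
    return sum(f(z) * prod(theta[x] for x in z) for z in tuples)


def muhy(f, m):
    """Average of f over the invariant atom with count vector m."""
    atom = [z for z in tuples if count_vector(z) == m]
    return sum(f(z) for z in atom) / len(atom)


def comn(g, theta):
    """Count multinomial prevision of a gamble g on count vectors, as in (5)."""
    return sum(g(m) * bernstein(m, theta) for m in count_vectors)


def f(z):
    """Arbitrary gamble: compares the first and the last observation."""
    return 1.0 if z[0] == z[-1] else -0.5


direct = mn(f, theta)
via_counts = comn(lambda m: muhy(f, m), theta)
agree = isclose(direct, via_counts)
total_mass = comn(lambda m: 1.0, theta)
```

The two routes give the same number up to rounding, and `total_mass` is one because the Bernstein basis polynomials of a fixed degree sum to one on the simplex.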

5.2. A representation theorem 

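Consider the linear subspace of $\mathcal{C}(\Sigma)$,

$$\mathcal{V}(\Sigma) := \{\mathrm{CoMn}^n(g \mid \cdot) : n \ge 1,\ g \in \mathcal{L}(\mathcal{N}^n)\} = \{\mathrm{Mn}^n(f \mid \cdot) : n \ge 1,\ f \in \mathcal{L}(\mathcal{X}^n)\},$$

each of whose elements is a polynomial function on the $\mathcal{X}$-simplex,

$$\mathrm{CoMn}^n(g \mid \theta) = \sum_{\mathbf{m} \in \mathcal{N}^n} g(\mathbf{m})\, \nu(\mathbf{m}) \prod_{x \in \mathcal{X}} \theta_x^{m_x} = \sum_{\mathbf{m} \in \mathcal{N}^n} g(\mathbf{m}) B_{\mathbf{m}}(\theta),$$

and is actually a linear combination of Bernstein basis polynomials with coefficients
$g(\mathbf{m})$. So, $\mathcal{V}(\Sigma)$ is the linear space spanned by all Bernstein basis polynomials and is
therefore the set of all polynomials on the $\mathcal{X}$-simplex $\Sigma$.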

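Now, if $\underline{R}$ is any coherent lower prevision on $\mathcal{V}(\Sigma)$, then it is easy to see that the family
of coherent lower previsions $\underline{P}^n$, $n \ge 1$, defined by

$$\underline{P}^n(f) := \underline{R}(\mathrm{Mn}^n(f \mid \cdot)), \quad f \in \mathcal{L}(\mathcal{X}^n), \qquad (7)$$

is still exchangeable and time consistent, and that the corresponding count distributions
are

$$\underline{Q}^n(g) = \underline{R}(\mathrm{CoMn}^n(g \mid \cdot)), \quad g \in \mathcal{L}(\mathcal{N}^n). \qquad (8)$$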

Here, we are going to show that a converse result also holds: for any time-consistent
family of exchangeable coherent lower previsions $\underline{P}^n$, $n \ge 1$, there is a coherent lower
prevision $\underline{R}$ on $\mathcal{V}(\Sigma)$ such that equation (7), or its reformulation for counts (8), holds.
We call such an $\underline{R}$ a representation, or representing coherent lower prevision, for the
family $\underline{P}^n$. Of course, any representing $\underline{R}$, if it exists, is uniquely determined on $\mathcal{V}(\Sigma)$.

So consider a family of coherent lower previsions $\underline{Q}^n$ on $\mathcal{L}(\mathcal{N}^n)$, $n \ge 1$, that are time
consistent. It suffices to find an $\underline{R}$ such that (8) holds because the corresponding
exchangeable lower previsions $\underline{P}^n$ on $\mathcal{L}(\mathcal{X}^n)$ are then uniquely determined by Theorem 2,
and automatically satisfy the condition (7). Our proposal is to define the functional $\underline{R}$
on the set $\mathcal{V}(\Sigma)$ as follows: consider any element $p$ of $\mathcal{V}(\Sigma)$. Then, by definition, there
is some $n \ge 1$ and a corresponding unique $b_p^n \in \mathcal{L}(\mathcal{N}^n)$ such that $p = \mathrm{CoMn}^n(b_p^n \mid \cdot)$. We
then let $\underline{R}(p) := \underline{Q}^n(b_p^n)$.

The first thing to check is whether this definition is consistent. 

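Lemma 3. Let $p$ be a polynomial of degree $m$ and let $n_1, n_2 \ge m$. Then
$\underline{Q}^{n_1}(b_p^{n_1}) = \underline{Q}^{n_2}(b_p^{n_2})$.

Proof. We may assume without loss of generality that $n_2 \ge n_1$. The Bernstein
decompositions $b_p^{n_1}$ and $b_p^{n_2}$ are then related by Zhou's formula [see equation (10) in the Appendix]:

$$b_p^{n_2}(\mathbf{m}') = \sum_{\mathbf{m} \in \mathcal{N}^{n_1}} \frac{\nu(\mathbf{m})\,\nu(\mathbf{m}' - \mathbf{m})}{\nu(\mathbf{m}')}\, b_p^{n_1}(\mathbf{m}), \quad \mathbf{m}' \in \mathcal{N}^{n_2}.$$

Consequently, by the time consistency requirement (3), $\underline{Q}^{n_1}(b_p^{n_1}) = \underline{Q}^{n_2}(b_p^{n_2})$. □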
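Zhou's formula is a degree elevation rule: it rewrites the same polynomial in a Bernstein basis of higher degree. The sketch below restricts attention to the binary case, so that a count vector is a pair (number of zeros, number of ones) and $\theta = (1 - t, t)$; the function names and the test polynomial are illustrative assumptions. It elevates the coefficients of $p(\theta) = \theta_1^2$ from degree 2 to degree 4 and checks that the polynomial is unchanged.

```python
from fractions import Fraction
from math import comb


def nu(m):
    """Binary case: the multinomial coefficient is a binomial one; 0 if invalid."""
    return comb(sum(m), m[1]) if min(m) >= 0 else 0


def count_vectors(n):
    return [(n - k, k) for k in range(n + 1)]


def elevate(b, n, k):
    """Degree elevation of Bernstein coefficients from degree n to n + k."""
    return {
        mu: sum(
            Fraction(nu(m) * nu((mu[0] - m[0], mu[1] - m[1])), nu(mu)) * b[m]
            for m in count_vectors(n)
        )
        for mu in count_vectors(n + k)
    }


def evaluate(b, n, t):
    """Value at theta = (1 - t, t) of the polynomial with coefficients b."""
    return sum(b[m] * nu(m) * (1 - t) ** m[0] * t ** m[1] for m in count_vectors(n))


b2 = {(2, 0): 0, (1, 1): 0, (0, 2): 1}    # coefficients of p(theta) = theta_1 ** 2
b4 = elevate(b2, 2, 2)
points = [Fraction(0), Fraction(1, 3), Fraction(1, 2), Fraction(1)]
same_polynomial = all(evaluate(b2, 2, t) == evaluate(b4, 4, t) for t in points)
```

Since the elevated coefficients are exactly the gamble that equation (3) feeds to the higher-order count distribution, time consistency makes the value assigned to `p` independent of the degree used to represent it.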

Lemma 4. $\underline{R}$ is a coherent lower prevision on the linear space $\mathcal{V}(\Sigma)$.

Proof. We show that $\underline{R}$ satisfies the necessary and sufficient conditions (P1)-(P3) for
coherence of a lower prevision on a linear space.

We first prove that (P1) is satisfied. Consider any $p \in \mathcal{V}(\Sigma)$. Let $m$ be the degree of
$p$. We must show that $\underline{R}(p) \ge \min p$. We find that $\underline{R}(p) = \underline{Q}^n(b_p^n) \ge \min b_p^n$ for all $n \ge m$
because of the coherence of $\underline{Q}^n$. However, equation (11) in the Appendix tells us that
$\min b_p^n \uparrow \min p$, so we indeed have $\underline{R}(p) \ge \min p$.

Next, consider any $p$ in $\mathcal{V}(\Sigma)$ and any real $\lambda \ge 0$. Consider any $n$ that is not smaller
than the degree of $p$. Since it is obvious that $b_{\lambda p}^n = \lambda b_p^n$, we get

$$\underline{R}(\lambda p) = \underline{Q}^n(b_{\lambda p}^n) = \underline{Q}^n(\lambda b_p^n) = \lambda \underline{Q}^n(b_p^n) = \lambda \underline{R}(p),$$

where the third equality follows from the coherence (non-negative homogeneity) of the
count lower prevision $\underline{Q}^n$. This tells us that $\underline{R}$ satisfies (P2).

Finally, consider $p$ and $q$ in $\mathcal{V}(\Sigma)$, and any $n$ that is not smaller than the maximum of
the degrees of $p$ and $q$. Since it is obvious that $b_{p+q}^n = b_p^n + b_q^n$, we get

$$\underline{R}(p + q) = \underline{Q}^n(b_{p+q}^n) = \underline{Q}^n(b_p^n + b_q^n) \ge \underline{Q}^n(b_p^n) + \underline{Q}^n(b_q^n) = \underline{R}(p) + \underline{R}(q),$$

where the inequality follows from the superadditivity of $\underline{Q}^n$. This tells us that $\underline{R}$ also
satisfies (P3) and, as a consequence, it is coherent. □

We can summarise the argument above as follows. 

Theorem 5 (Representation theorem for exchangeable sequences). Given a
time-consistent family of exchangeable coherent lower previsions $\underline{P}^n$ on $\mathcal{L}(\mathcal{X}^n)$, $n \ge 1$,
there is a unique coherent lower prevision $\underline{R}$ on the linear space $\mathcal{V}(\Sigma)$ of all polynomial
gambles on the $\mathcal{X}$-simplex such that for all $n \ge 1$, all $f \in \mathcal{L}(\mathcal{X}^n)$ and all $g \in \mathcal{L}(\mathcal{N}^n)$,

$$\underline{P}^n(f) = \underline{R}(\mathrm{Mn}^n(f \mid \cdot)) \quad \text{and} \quad \underline{Q}^n(g) = \underline{R}(\mathrm{CoMn}^n(g \mid \cdot)). \qquad (9)$$

Hence, the belief model governing any countable exchangeable sequence in $\mathcal{X}$ can be
completely characterised by a coherent lower prevision on the linear space of polynomial
gambles on $\Sigma$.

In the particular case where we have a time-consistent family of exchangeable linear
previsions $P^n$ on $\mathcal{L}(\mathcal{X}^n)$, $n \ge 1$, $\underline{R}$ will be a linear prevision $R$ on the linear space $\mathcal{V}(\Sigma)$ of
all polynomial gambles on the $\mathcal{X}$-simplex. As such, it will be characterised by its values
$R(B_{\mathbf{m}})$ on the Bernstein basis polynomials $B_{\mathbf{m}}$, $\mathbf{m} \in \mathcal{N}^n$, $n \ge 1$, or on any other basis of
$\mathcal{V}(\Sigma)$.

It is a consequence of coherence that $\underline{R}$ is also uniquely determined on the set $\mathcal{C}(\Sigma)$
of all continuous gambles on the $\mathcal{X}$-simplex $\Sigma$: by the Stone-Weierstrass theorem, any
such gamble is the uniform limit of some sequence of polynomial gambles and coherence
implies that the lower prevision of a uniform limit is the limit of the lower previsions.

This unicity result cannot be extended to more general (discontinuous) types of
gambles: the coherent lower prevision $\underline{R}$ is not uniquely determined on the set of all gambles
$\mathcal{L}(\Sigma)$ on the simplex and there may be different coherent lower previsions $\underline{R}_1$ and $\underline{R}_2$ on
$\mathcal{L}(\Sigma)$ satisfying equation (9). But any such lower previsions will agree on the class $\mathcal{V}(\Sigma)$
of polynomial gambles, which is the class of gambles we need in order to characterise the
exchangeable sequence.

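We now investigate the meaning of the representing lower prevision $\underline{R}$ a bit further.
Consider the sequence of so-called frequency random variables $F_n := T^n(X_1, \ldots, X_n)/n$
corresponding to an exchangeable sequence of random variables $X_1, \ldots, X_n, \ldots$ and
assuming values in the $\mathcal{X}$-simplex $\Sigma$. The distribution $\underline{P}_{F_n}$ of $F_n$ is given by

$$\underline{P}_{F_n}(h) := \underline{Q}^n\bigl(h \circ \tfrac{1}{n}\bigr) = \underline{R}\bigl(\mathrm{CoMn}^n(h \circ \tfrac{1}{n} \mid \cdot)\bigr), \quad h \in \mathcal{L}(\Sigma),$$

because we know that $\underline{Q}^n$ is the distribution of $T^n(X_1, \ldots, X_n)$, and also taking into
account Theorem 5 for the last equality. Here $h \circ \tfrac{1}{n}$ is the gamble on $\mathcal{N}^n$ that maps
$\mathbf{m}$ to $h(\mathbf{m}/n)$. Now,

$$\mathrm{CoMn}^n\bigl(h \circ \tfrac{1}{n} \mid \theta\bigr) = \sum_{\mathbf{m} \in \mathcal{N}^n} h\Bigl(\frac{\mathbf{m}}{n}\Bigr) B_{\mathbf{m}}(\theta)$$

is the Bernstein approximant or approximating Bernstein polynomial of degree $n$ for the
gamble $h$ and it is a known result (see [9], Section VII.2, or [11], Section 2) that the
sequence of approximating Bernstein polynomials $\mathrm{CoMn}^n(h \circ \tfrac{1}{n} \mid \cdot)$ converges uniformly
to $h$ as $n \to \infty$ if $h$ is continuous. So, because $\underline{R}$ is uniquely defined and uniformly
continuous on the set $\mathcal{C}(\Sigma)$, we find the following result.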

Theorem 6. For all continuous gambles $h$ on $\Sigma$, we have that

$$\lim_{n \to \infty} \underline{P}_{F_n}(h) = \underline{R}(h)$$

or, in other words, the sequence of distributions $\underline{P}_{F_n}$ converges pointwise to $\underline{R}$ on $\mathcal{C}(\Sigma)$
and, in this specific sense, the sample frequencies $F_n$ converge in distribution.
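The uniform convergence of Bernstein approximants, on which Theorem 6 rests, can be observed numerically in the binary case, where the simplex is parametrised by the frequency $t$ of one of the two values. The sketch below assumes the continuous (but not polynomial) gamble $h(t) = |t - 1/2|$ and a finite evaluation grid, both chosen purely for illustration; the maximum over the grid only approximates the supremum.

```python
from math import comb


def bernstein_approximant(h, n, t):
    """Degree-n Bernstein approximant of h, evaluated at theta = (1 - t, t)."""
    return sum(
        h(k / n) * comb(n, k) * t ** k * (1 - t) ** (n - k) for k in range(n + 1)
    )


def h(t):
    """A continuous gamble on the binary simplex with a kink at t = 1/2."""
    return abs(t - 0.5)


grid = [i / 200 for i in range(201)]
errors = [
    max(abs(bernstein_approximant(h, n, t) - h(t)) for t in grid)
    for n in (10, 40, 160)
]
```

The entries of `errors` shrink as the degree grows, which is the behaviour that lets a coherent (hence uniformly continuous) lower prevision carry the limit through.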



6. Conclusions 



We have shown that the notion of exchangeability has a natural place in the theory of 
coherent lower previsions. Indeed, with our distinctive approach using Bernstein
polynomials, and gambles rather than events, it seems fairly natural and easy to derive
representation theorems directly for coherent lower previsions and to derive the corresponding
results for precise probabilities (linear previsions) as special cases. 

Interesting results can also be obtained in a context of predictive inference, where a 
coherent exchangeable lower prevision for $n + k$ variables is updated with the information
that the first n variables have been observed to assume certain values. For a fairly detailed 
discussion of these issues, we refer to de Cooman and Miranda [4], Section 9.3. 



Appendix: Multivariate Bernstein polynomials 

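To any $n \ge 0$ and $\mathbf{m} \in \mathcal{N}^n$, there corresponds a Bernstein (basis) polynomial of degree $n$
on $\Sigma$, given by $B_{\mathbf{m}}(\theta) = \nu(\mathbf{m}) \prod_{x \in \mathcal{X}} \theta_x^{m_x}$, $\theta \in \Sigma$. These polynomials have a number of
very interesting properties (see, for instance, [17], Chapters 10 and 11):

(B1) they are non-negative, and strictly positive in the interior of $\Sigma$;
(B2) the set $\{B_{\mathbf{m}} : \mathbf{m} \in \mathcal{N}^n\}$ of all Bernstein polynomials of fixed degree $n$ forms a
basis for the linear space of all polynomials whose degree is at most $n$.

Hence, for any polynomial $p$ of degree $m$ and any $n \ge m$, there is a unique gamble $b_p^n$ on $\mathcal{N}^n$ such that

$$p = \sum_{\mathbf{m} \in \mathcal{N}^n} b_p^n(\mathbf{m}) B_{\mathbf{m}} = \mathrm{CoMn}^n(b_p^n \mid \cdot).$$

This tells us that each $p(\theta)$ is a convex combination of the Bernstein coefficients $b_p^n(\mathbf{m})$,
$\mathbf{m} \in \mathcal{N}^n$, so $\min b_p^n \le \min p \le p(\theta) \le \max p \le \max b_p^n$. It also follows that for all $k \ge 0$ and
all $\boldsymbol{\mu}$ in $\mathcal{N}^{n+k}$,

$$b_p^{n+k}(\boldsymbol{\mu}) = \sum_{\mathbf{m} \in \mathcal{N}^n} \frac{\nu(\mathbf{m})\,\nu(\boldsymbol{\mu} - \mathbf{m})}{\nu(\boldsymbol{\mu})}\, b_p^n(\mathbf{m}). \qquad (10)$$

This is Zhou's formula (see [17], Section 11.9). Moreover, since for any polynomial $p$ on $\Sigma$
of degree $m$, the $b_p^n$ converge uniformly to $p$ as $n \to \infty$ (see, for instance, [18]), it follows
that

$$\lim_{n \to \infty,\ n \ge m} [\min b_p^n, \max b_p^n] = [\min p, \max p] = p(\Sigma). \qquad (11)$$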
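The bracketing of the range of $p$ by its Bernstein coefficients, and the tightening described by equation (11), can be seen on a one-variable example. The sketch below assumes the binary case with $\theta = (1 - t, t)$ and uses the standard conversion from the power basis, $t^i = \sum_j \binom{j}{i} / \binom{n}{i}\, B_{j,n}(t)$, rather than Zhou's formula; the function name and the polynomial $p(t) = t - t^2$ are illustrative choices.

```python
from fractions import Fraction
from math import comb


def bernstein_coefficients(a, n):
    """Degree-n Bernstein coefficients b(0..n) of p(t) = sum_i a[i] * t**i.

    Requires n to be at least the degree of p.
    """
    return [
        sum(Fraction(a[i] * comb(j, i), comb(n, i)) for i in range(len(a)))
        for j in range(n + 1)
    ]


a = [0, 1, -1]   # p(t) = t - t**2, with min p = 0 and max p = 1/4 on [0, 1]
ranges = {}
for n in (2, 4, 8, 64):
    b = bernstein_coefficients(a, n)
    ranges[n] = (min(b), max(b))
```

For this polynomial the largest coefficient decreases from 1/2 at degree 2 towards the true maximum 1/4, while the smallest coefficient already equals the true minimum 0.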